ABSTRACT
We propose Nadaraya-Watson type nonparametric estimation of the
conditional expectation of the dependent variable as a means of
computing analytical partial derivatives (e.g., response coefficient,
elasticity) with respect to appropriate variables. An illustrative
example concerns the effect of age on earnings.
1.   INTRODUCTION  AND  THE  MODEL


Let us consider an amorphous specification

$$y = R(x_1, \ldots, x_p) + u = E(y \mid x_1, \ldots, x_p) + u \qquad (1)$$

where $y$ is an $n \times 1$ vector of observations on the dependent
variable, $x_1, \ldots, x_p$ are each $n \times 1$ vectors of
observations on $p$ regressors, $u$ is an $n \times 1$ vector of errors,
and the regression function $R = R(\cdot)$ is an unspecified expectation
of $y$ conditional on $x_1$ to $x_p$, denoted by
$E(y \mid x_1, \ldots, x_p)$. In the usual parametric econometrics, one
considers $R$ as a linear function of $x_1$ to $x_p$,
$R = x_1 \beta_1 + \cdots + x_p \beta_p$, where $\beta_j$,
$j = 1, \ldots, p$, is interpreted as the partial derivative (regression
coefficient) of $y$ with respect to $x_j$. Recent research has focused on
achieving greater economic realism by using flexible functional forms;
see, e.g., Barnett (1984) and Elbadawi et al. (1983).

In this paper we directly estimate the conditional expectation
$R = E(y \mid x_1, \ldots, x_p)$ together with its partial derivatives
with respect to $x_j$ for $j = 1, \ldots, p$. The conditional expectation
is estimated by the nonparametric kernel method, and the estimation of
the analytic partial derivatives appears to be new. This nonparametric
approach to calculating partial derivatives has several advantages
compared to the usual parametric approach. First, it does not require
any a priori assumption about the functional form. Second, the x's are
considered to be stochastic, as is the case in nonexperimental subjects
like economics. Third, it does not require any assumption about the data
generating process (the joint density of $y, x_1, \ldots, x_p$).
2.   ESTIMATION  OF  PARTIAL  DERIVATIVES

Let $x_t$, $t = 1, \ldots, n$, be $n$ independent and identically
distributed random vectors generated from an unknown $m$-variate density
function $f(x_1, \ldots, x_m)$. Consider $K$ to be a class of all
Borel-measurable real valued bounded functions $k$ on the
$m$-dimensional Euclidean space $R^m$ such that

$$\int k(w)\,dw = 1, \qquad \int |k(w)|\,dw < \infty, \qquad
\|w\|^m |k(w)| \to 0 \ \text{as} \ \|w\| \to \infty \qquad (2)$$

where $\|w\|$ is the usual Euclidean norm of $w$ in $R^m$. Cacoullos
(1966) estimated the joint density $f(x_j,\ j = 1, \ldots, m)$ at the
point $(x_{10}, \ldots, x_{m0})$ by

$$f_n(x_{10}, \ldots, x_{m0}) = n^{-1} \Big( \prod_{j=1}^{m} h_j \Big)^{-1}
\sum_{t=1}^{n} k(w_{1t}, \ldots, w_{mt}) \qquad (3)$$

where $h_j$ is a sequence of positive numbers ("window widths") tending
to zero as $n$ tends to infinity and

$$w_{jt} = (x_{jt} - x_{j0}) / h_j. \qquad (4)$$

Following Singh (1981) and Ullah and Singh (1985) we use, without loss
of generality, a joint kernel function $k$ in (3) which is a product of
$m$ kernel functions $k(w_j)$, where each $k(w_j)$ satisfies (2). The
particular choice of such kernel considered in Section 2.1 below is

$$k(w_{jt}) = (2\pi)^{-1/2} \exp[-(1/2) w_{jt}^2], \qquad (5)$$

which is proportional to the normal density.
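The density estimator in (3) with the product normal kernel (5) can be sketched in a few lines of NumPy. This is only an illustrative sketch: the function names `gaussian_kernel` and `product_kernel_density`, the simulated bivariate normal sample, and the window widths of 0.3 are assumptions made here for the example and do not come from the paper.

```python
import numpy as np

def gaussian_kernel(w):
    """Univariate normal kernel of equation (5)."""
    return np.exp(-0.5 * w**2) / np.sqrt(2.0 * np.pi)

def product_kernel_density(X, x0, h):
    """Density estimate of equation (3) at the point x0.

    X  : (n, m) array of observations
    x0 : (m,) evaluation point
    h  : (m,) window widths, one per variable
    """
    X = np.asarray(X, dtype=float)
    x0 = np.asarray(x0, dtype=float)
    h = np.asarray(h, dtype=float)
    W = (X - x0) / h                      # w_jt from (4), shape (n, m)
    k = gaussian_kernel(W).prod(axis=1)   # product kernel k(w_1t, ..., w_mt)
    return k.sum() / (X.shape[0] * h.prod())

# Hypothetical data: a simulated bivariate standard normal sample.
rng = np.random.default_rng(0)
X = rng.standard_normal((2000, 2))
f_hat = product_kernel_density(X, x0=[0.0, 0.0], h=[0.3, 0.3])
print(f_hat)  # true density at the origin is 1 / (2 * pi), about 0.159
```

The estimate is smoothed toward a flatter density by the window width, so it need not match the true value closely in a finite sample.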
Let $y$ be denoted by $x_m$ with $m = p + 1$. Then the estimate of the
marginal density $f(x_1, \ldots, x_p)$ at $x_{10}, \ldots, x_{p0}$ can be
obtained by integrating out the variable $y = x_m$. Further, the
estimate of the conditional mean $E(y \mid x_1, \ldots, x_p)$, or the
regression function $R$ in (1), at $x_{10}, \ldots, x_{p0}$ is

$$R_n(x_{10}, \ldots, x_{p0}) = R_n
= \int x_{m0} \, \frac{f_n(x_{10}, \ldots, x_{m0})}{f_n(x_{10}, \ldots, x_{p0})} \, dx_{m0}
= \sum_{t=1}^{n} y_t r_t ; \qquad
r_t = k(w_t) \Big/ \sum_{s=1}^{n} k(w_s) \qquad (6)$$

where $k(w_t) = k(w_{1t}, \ldots, w_{pt}) = \prod_{j=1}^{p} k(w_{jt})$.
Note that $\sum_{t=1}^{n} r_t = 1$. This is the Nadaraya (1964) and
Watson (1964) type estimator. The consistency and asymptotic properties
of this estimator have been analyzed in Singh et al. (1987) and Bierens
(1987); also see Schuster (1972) and Rao (1983, Ch. 4) for the special
case $p = 1$.
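As a minimal sketch of the estimator in (6), the weights $r_t$ and the fitted value can be computed as follows. The function names `nw_weights` and `nw_regression` and the noise-free linear test data are assumptions introduced for illustration only. The normalizing constant of the normal kernel is omitted because it cancels in the ratio defining $r_t$.

```python
import numpy as np

def nw_weights(X, x0, h):
    """Weights r_t of equation (6) for the product normal kernel.

    X  : (n,) or (n, p) regressors
    x0 : scalar or (p,) evaluation point
    h  : scalar or (p,) window widths
    """
    X = np.asarray(X, dtype=float)
    X = X.reshape(X.shape[0], -1)
    W = (X - np.asarray(x0, dtype=float)) / np.asarray(h, dtype=float)
    k = np.exp(-0.5 * (W**2).sum(axis=1))  # (2*pi)^(-p/2) cancels in r_t
    return k / k.sum()

def nw_regression(X, y, x0, h):
    """Nadaraya-Watson estimate R_n at x0: sum over t of y_t * r_t."""
    return float(nw_weights(X, x0, h) @ np.asarray(y, dtype=float))

# Hypothetical check: noise-free linear data on a grid symmetric about 0.5.
x = np.linspace(0.0, 1.0, 201)
y = 2.0 + 3.0 * x
r = nw_weights(x, 0.5, 0.1)
fit = nw_regression(x, y, 0.5, 0.1)
print(r.sum(), fit)
```

If the evaluation point lies far from every observation, all kernel values can underflow to zero and the ratio is undefined, so in practice $x_0$ should stay within the range of the data.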

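We turn next to the nonparametric estimation of the partial derivatives
of $R$ with respect to $x_j$. The analytical expression for these
derivatives follows from (6) as

$$\partial R_n / \partial x_{j0} = \sum_{t=1}^{n} y_t (K_{1t} - K_{2t}) \qquad (7)$$

where

$$K_{1t} = k'(w_t) \Big/ \sum_{s=1}^{n} k(w_s) \qquad (8)$$

$$K_{2t} = k(w_t) \sum_{s=1}^{n} k'(w_s) \Big( \sum_{s=1}^{n} k(w_s) \Big)^{-2} \qquad (9)$$

$$k'(w_t) = \partial k(w_t) / \partial x_{j0} = w_{jt} h_j^{-1} k(w_t) \qquad (10)$$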
where the second equality in (10) is true only for the normal kernel
and k(w) is as given in (6). The asymptotic properties of the estimator
in (7) follow directly from the asymptotic results in Singh et al.
(1987) and Schuster (1972), and they are not reproduced here for the
sake of space. The interested reader may see the unpublished reports by
Vinod and Ullah (1987), Rilstone (1987), and Rilstone and Ullah (1987),
which use numerical derivatives instead of the analytical expressions
used in (7). We do, however, briefly outline the main results.
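The analytical derivative in (7) to (10) can be sketched directly from the kernel values. The names `nw_derivative_weights` and `nw_derivative`, the zero-based regressor index `j`, and the noise-free test data are assumptions made for this illustration. As in (10), the expression for $k'(w_t)$ is specific to the normal kernel.

```python
import numpy as np

def nw_derivative_weights(X, x0, h, j=0):
    """Return K_1t - K_2t of equations (8)-(9) for the normal product kernel."""
    X = np.asarray(X, dtype=float)
    X = X.reshape(X.shape[0], -1)
    h = np.broadcast_to(np.asarray(h, dtype=float), (X.shape[1],))
    W = (X - np.asarray(x0, dtype=float)) / h
    k = np.exp(-0.5 * (W**2).sum(axis=1))  # k(w_t) up to a constant that cancels
    dk = W[:, j] / h[j] * k                # k'(w_t) from (10)
    K1 = dk / k.sum()                      # equation (8)
    K2 = k * dk.sum() / k.sum() ** 2       # equation (9)
    return K1 - K2

def nw_derivative(X, y, x0, h, j=0):
    """Estimate of dR/dx_j at x0 from equation (7)."""
    return float(np.asarray(y, dtype=float) @ nw_derivative_weights(X, x0, h, j))

# Hypothetical check: for noise-free y = 2 + 3x the slope should be close to 3.
x = np.linspace(0.0, 1.0, 201)
y = 2.0 + 3.0 * x
c = nw_derivative_weights(x, 0.5, 0.05)
slope = nw_derivative(x, y, 0.5, 0.05)
print(c.sum(), slope)
```

The weights $K_{1t} - K_{2t}$ sum to zero, so adding a constant to every $y_t$ leaves the derivative estimate unchanged.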

Using (1) in (6) we observe that

$$R_n = \sum_{t=1}^{n} R(x_{1t}, \ldots, x_{pt}) r_t + \sum_{t=1}^{n} u_t r_t
= R + \sum_{t=1}^{n} u_t r_t + o_p(1) \qquad (11)$$

and, therefore, for large $n$

$$\frac{\partial R_n}{\partial x_{j0}} - \frac{\partial R}{\partial x_{j0}}
= \sum_{t=1}^{n} u_t (K_{1t} - K_{2t}) \qquad (12)$$

where the second equality in (11) follows by using the Taylor series
expansion of $R(x_{1t}, \ldots, x_{pt})$ around $x_{10}, \ldots, x_{p0}$;
$o_p(1)$ represents terms tending to zero in probability. Since the
expectation of $u$ conditional on x's is zero according to (1), the
consistency of the estimator in (7) follows from (12). For the
asymptotic distribution we note that, conditional on x's,

$$Z = \Big( \frac{\partial R_n}{\partial x_{j0}} - \frac{\partial R}{\partial x_{j0}} \Big)
\Big/ A^{1/2}(x_0) \sim N(0, 1) \qquad (13)$$

where, from (12),
$A(x_0) = \sum_{t=1}^{n} \sigma_u^2(x_t) (K_{1t} - K_{2t})^2$ is the
asymptotic variance of $\partial R_n / \partial x_{j0}$ conditional on
x's and $\sigma_u^2(x_t)$ is the conditional variance of $u_t$. Since the
conditional distribution of $Z$ in (13) is free from x's, the
unconditional distribution of $Z$ is also $N(0, 1)$. Note that Vinod and
Ullah (1987) and Rilstone and Ullah (1987) have considered unconditional
variance in the denominator of (13).
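The form of the variance $A(x_0)$ can be read off from (12) in one step. The line below is a sketch under two assumptions that are not spelled out in the text: the errors $u_t$ are mutually uncorrelated given the x's (as i.i.d. sampling would imply), and the weights $K_{1t} - K_{2t}$ are treated as fixed once the x's are given.

```latex
\operatorname{Var}\Big( \sum_{t=1}^{n} u_t (K_{1t} - K_{2t}) \,\Big|\, x \Big)
  = \sum_{t=1}^{n} (K_{1t} - K_{2t})^2 \operatorname{Var}(u_t \mid x_t)
  = \sum_{t=1}^{n} \sigma_u^2(x_t) (K_{1t} - K_{2t})^2
  = A(x_0).
```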

2.1   Numerical  Example 

As an illustration we analyze the response of earnings with respect to
age. The partial derivative of the following semi-log nonparametric
regression model provides the desired answer. Note that economists are
specifically interested in the estimates of the partial derivatives,
not in the regression coefficients of a linear model per se.

Suppose $y_t$ is the logarithm of earnings and $x_t$ is the age of the
$t$-th individual. For simplicity in illustration, we assume schooling
to be constant. Now the nonparametric specification of the model is

$$y_t = R(x_t) + u_t = E(y_t \mid x_t) + u_t \qquad (14)$$

which is a special case of (1) for $p = 1$. Note that the estimate of
$R$ and its partial derivative with respect to $x$ can be calculated by
using (6) and (7), respectively. For the calculations the kernel used
was as given in (5), and following Singh et al. (1987) and Rao (1983,
pp. 65-67) the window-width $h$ taken was $s n^{-1/5}$ where
$s^2 = \sum_{t=1}^{n} (x_t - \bar{x})^2 / n$. Further, the conditional
variance of the partial derivative $A(x_0)$ in (13) was obtained by
using its consistent estimator
$\sum_{t=1}^{n} \hat{\sigma}_u^2(x_t) (K_{1t} - K_{2t})^2$ where
$\hat{\sigma}_u^2(x_t) = \sum_{s=1}^{n} \hat{u}_s^2 r_s$ is the weighted
residual sum of squares (RSS) based on the nonparametric residual
$\hat{u} = y - R_n$.
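The window width and the standard error described above can be sketched for the single-regressor case as follows. Several things here are assumptions rather than the paper's stated procedure: the function names, the simulated ages and log earnings (not the Census data), and in particular the reading of $\hat{\sigma}_u^2(x_t)$ as a kernel-weighted average of squared residuals with weights centred at each $x_t$.

```python
import numpy as np

def window_width(x):
    """h = s * n**(-1/5), with s**2 = sum((x - mean)**2) / n."""
    x = np.asarray(x, dtype=float)
    return x.std() * x.size ** (-0.2)  # np.std divides by n by default

def derivative_std_error(x, y, x0, h):
    """Square root of the estimated A(x0) in (13), one regressor."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    K = np.exp(-0.5 * ((x[None, :] - x[:, None]) / h) ** 2)
    R = K / K.sum(axis=1, keepdims=True)  # row i: weights r_t centred at x_i
    u2 = (y - R @ y) ** 2                 # squared nonparametric residuals
    sigma2 = R @ u2                       # assumed reading of sigma_u^2(x_t)
    w = (x - x0) / h
    k = np.exp(-0.5 * w**2)
    dk = w / h * k                        # k'(w_t) from (10)
    c = dk / k.sum() - k * dk.sum() / k.sum() ** 2  # K_1t - K_2t
    return float(np.sqrt(np.sum(sigma2 * c**2)))

# Hypothetical data standing in for ages and log earnings.
rng = np.random.default_rng(1)
x = rng.uniform(20.0, 60.0, 205)
y = 9.0 + 0.17 * x - 0.002 * x**2 + 0.5 * rng.standard_normal(205)
h = window_width(x)
se = derivative_std_error(x, y, 40.0, h)
print(h, se)
```

The pairwise kernel matrix needs memory of order $n^2$, which is harmless for a few hundred observations but would call for a different layout on large samples.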
In the extensive labor econometrics literature the parametric
specification of the model is

$$R = E(y_t \mid x_t) = \alpha + \beta x_t + \gamma x_t^2, \qquad (15)$$

see Heckman and Polachek (1974) and Mincer (1974) among others. In
this model the estimate of the partial derivative is

$$\hat{\beta} + 2 \hat{\gamma} x_t \qquad (16)$$

and its variance, conditional on x's, is given by

$$V(\hat{\beta}) + 4 x_t^2 V(\hat{\gamma}) + 4 x_t \operatorname{cov}(\hat{\beta}, \hat{\gamma}), \qquad (17)$$

where $\hat{\beta}$ and $\hat{\gamma}$ are the respective estimates of
$\beta$ and $\gamma$.
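For comparison, the parametric derivative (16) and its variance (17) can be sketched with ordinary least squares in NumPy. The function name `quadratic_ols_derivative`, the simulated sample, and the use of the usual homoskedastic covariance estimate $s^2 (Z'Z)^{-1}$ are assumptions made for this example; the paper does not state how the OLS covariance matrix was computed.

```python
import numpy as np

def quadratic_ols_derivative(x, y, x_eval):
    """OLS fit of (15); returns the derivative (16) and its variance (17)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    Z = np.column_stack([np.ones_like(x), x, x**2])  # regressors 1, x, x^2
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    resid = y - Z @ coef
    s2 = resid @ resid / (len(y) - 3)                # residual variance
    V = s2 * np.linalg.inv(Z.T @ Z)                  # assumed homoskedastic covariance
    _, beta, gamma = coef
    deriv = beta + 2.0 * gamma * x_eval              # equation (16)
    var = (V[1, 1] + 4.0 * x_eval**2 * V[2, 2]
           + 4.0 * x_eval * V[1, 2])                 # equation (17)
    return float(deriv), float(var)

# Hypothetical data standing in for ages and log earnings.
rng = np.random.default_rng(2)
x = rng.uniform(20.0, 60.0, 205)
y = 9.0 + 0.17 * x - 0.002 * x**2 + 0.5 * rng.standard_normal(205)
deriv, var = quadratic_ols_derivative(x, y, 40.0)
print(deriv, np.sqrt(var))
```

Because $\hat{\beta}$ and $\hat{\gamma}$ are strongly negatively correlated when age is not centred, the covariance term in (17) offsets much of the two variance terms.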

For the purpose of calculations, we considered Canadian data (1971
Canadian Census Public Use Tapes) on 205 individuals' ages and their
earnings. These individuals were educated to grade 13. Below we present
the parametric estimates based on ordinary least squares (OLS) and our
nonparametric estimates of the partial derivatives.

                              Partial Derivative   St. Error    RSS

Nonparametric Estimation            .0162             .002      63.54

Parametric OLS Estimation           .0189             .003      63.60

The OLS estimation of the parametric model (15) is

$$y = \underset{(.518)}{10.041} + \underset{(.027)}{.173}\,x - \underset{(.003)}{.002}\,x^2 \qquad (18)$$

where the numbers in parentheses are standard errors.
The nonparametric and parametric partial derivative estimates given
above are the means of the respective partial derivative estimates at
various ages. These estimates and their corresponding RSS are quite
similar, although the nonparametric standard error is smaller. This is
not surprising since for the data under consideration the quadratic
parametric specification in (15) is fairly close to the true
nonparametric specification, see Ullah (1985). The nonparametric
specification in Ullah (1985), however, indicates a "dip" around the
mean age with the result that when the nonparametric partial derivative
was calculated at the mean age of 39, it gave the value -.008, while the
parametric estimate remained the same as expected.

Another point to be noted is that if an investigator incorrectly
specifies the parametric model (15) as $R = \alpha + \beta x_t + u_t$,
then the partial derivative estimate will be approximately .011, which
is well away from the OLS value of .0189 above. Since this bias problem
due to the misspecification of the functional form does not arise in the
nonparametric approach, applied econometricians may find nonparametric
estimation attractive in a variety of other problems. Of course, certain
modern "tests" on specification may reveal the inadequacy of the OLS,
but may not reveal a practical alternative afforded by our nonparametric
approach.

Potential users should, however, be well aware of the following
limitations of the nonparametric approach. We know very little about the
reliability of tests obtained by this approach in finite samples. The
standard errors could be imprecise when the number of regressors is
large and/or the sample is small. Not much is known about the selection
of window width in small samples. For further details about these
limitations and the areas of future research, see Ullah (1987).